Analysis of LSF frame selection in voice conversion
Abstract
In practical applications of voice conversion, it is necessary to cope with small amounts of speaker-specific training data. Consequently, most of the proposed voice conversion algorithms are based on probabilistic conversion functions. Recently, however, there has been increased interest in unit-selection-based approaches to voice conversion. It is evident that typical training sets are too small to enable meaningful selection of large units such as diphones. But would it be possible to obtain high-quality results with smaller segments, such as frames, provided that the selection is handled very well? In this paper, we analyze the performance of the frame selection approach in ideal conditions. In the experiments, the line spectral frequencies (LSFs) of test sentences are replaced with the best matches from different training sets. The results show that perceptually transparent quality cannot be achieved with realistic database sizes.
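The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which distance measure or data were used, so the squared Euclidean distance, the synthetic LSF-like vectors, and all function names below (`select_frames`, `mean_distortion`, `random_lsf`) are assumptions made purely for illustration.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two LSF vectors of equal length.

    Assumption: the abstract does not name the matching criterion used.
    """
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_frames(test_frames, training_frames):
    """Return, for each test frame, the index of its closest training frame."""
    return [
        min(range(len(training_frames)),
            key=lambda i: squared_distance(frame, training_frames[i]))
        for frame in test_frames
    ]


def mean_distortion(test_frames, training_frames):
    """Average Euclidean distance between each test frame and its replacement."""
    picks = select_frames(test_frames, training_frames)
    total = sum(math.sqrt(squared_distance(f, training_frames[i]))
                for f, i in zip(test_frames, picks))
    return total / len(test_frames)


def random_lsf(order, rng):
    """Synthetic stand-in for an LSF vector: sorted values between 0 and pi."""
    return sorted(rng.uniform(0.0, math.pi) for _ in range(order))


# Hypothetical data: random vectors, not real speech features.
rng = random.Random(0)
database = [random_lsf(10, rng) for _ in range(2000)]
test_set = [random_lsf(10, rng) for _ in range(50)]

# Nested training sets of growing size, mimicking different database sizes.
for size in (100, 500, 2000):
    print(size, round(mean_distortion(test_set, database[:size]), 3))
```

Because each training set here is a subset of the next, the mean distortion cannot increase as the database grows. How large the database must be before the remaining distortion becomes inaudible is the question the paper investigates; this sketch makes no claim about that.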
Similar resources
CELP Coder Modification for the Voice Conversion
Voice conversion (VC) consists of modifying a source voice so that it sounds like a target speaker's voice. In our approach, we modified only the code-excited linear prediction (CELP) coder by introducing a pre-processing step before the coder for the voice conversion. The decoder part of CELP was not modified. This allows the transmission rate to be maintained. Our approach for conversion consists in separating the voiced an...
Adding Glottal Source Information to Intra-Lingual Voice Conversion
This paper studies the inclusion of glottal source characteristics in voice conversion (VC) systems. We use source/filter decomposition to parametrize the vocal tract using LSF, the glottal source using the LF model, and the aspiration noise using amplitude-modulated high-pass filtered AWGN noise. To evaluate the impact of this new parametrization in VC, we use a reference conversion system tha...
Emotional Speech Synthesis Based on Improved Codebook Mapping Voice Conversion
This paper presents a spectral transformation method for emotional speech synthesis based on voice conversion framework. Three emotions are studied, including anger, happiness and sadness. For the sake of high naturalness, superior speech quality and emotion expressiveness, our original STASC system is modified by introducing a new feature selection strategy and hierarchical codebook mapping pr...
On Residual Prediction in Voice Conversion Task
Nowadays, voice conversion is a problem that is intensively analyzed by many researchers. A large group of existing voice conversion systems is based on RELP re-synthesis. Within these systems, the speech signal is pitch-synchronously segmented and described with LSF parameters. A transformation function is acquired by employing pairs of equal time-aligned utterances from source and target spea...
New algorithm for LPC residual estimation from LSF vectors for a voice conversion system
Voice conversion involves transforming segments of speech from a source speaker so that they are perceived as if spoken by a target speaker. Generally, this process involves the estimation of vocal tract parameters and an excitation signal that match the target speaker. The work presented here proposes an algorithm for estimating the excitation residuals of the target speaker using a weighted...
Publication date: 2009